Figure 1 

The novel gene as identified through RACE analysis (894 bp) 



GGGAGTGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAA 

AGCTTGCCGAACTAAAGCAAGAATGTCTTGCTCGTGGTTTGGAGACCAAGGGAATAAAG 

CAAGATCTTATCCACAGACTCCAGGCATATCTTGAAGAACATGCTGAAGAGGAGGCAAAT 

GAAGAAGATGTACTGGGAGATGAAACAGAGGAAGAAGAAACAAAGCCCATTGAGCTCCC 

TGTCAAAGAGGAAGAACCCCCTGAAAAAACTGTTGATGTGGCAGCAGAGAAGAAAGTGG 

TGAAAATTACATCTGAAATACCACAGACTGAGAGAATGCAGAAGAGGGCTGAACGATTCA 

ATGTACCTGTGAGCTTGGAGAGTAAGAAAGCTGCTCGGGCAGCTAGGTTTGGGATTTCT 

TCAGTTCCAACAAAAGGTCTGTCATCTGATAACAAACCTATGGTTAACTTGGATAAGCTG 

AAGGAAAGAGCTCAAAGATTTGGTTTGAATGTCTCTTCAATCTCCAGAAAGTCTGAAGAT 

GATGAGAAACTGAAAAAGAGGAAGGAGCGATTTGGGATTGTCACAAGTTCAGCTGGAAC 

TGGAACCACAGAGGATACAGAGGCAAAGAAGAGGAAAAGAGCAGAGCGCTTTGGGATT 

GCCTGATGAAAAGTTCCTGATACTTTCTGTTCTCCAGTGTTTTCCATTTCTCTCCTTCTTC 

TTGGTCACATATATGCGTAAATGCACAGTCATGTGGCTACGTCCTGCGTCGCAATGAGG 

GAGCATGTACCGCAGGTACATCCATGAACTGCGGCAGCAGTTTGACTTATTGCTGTTTCA 

GCTTTAAGGTTGTTGTGTTTTTGTTTTTGATTATGTTGCTTGTTAATAAAAAAAAATAGAAA 

A 



Figure 2 

Amino acid sequence as translated from the novel gene (210 amino acids) 

MATETVELHKLKLAELKQECLARGLETKGIKQDLIHRLQAYLEEHAEEEANEEDVLGDETEEE 
ETKPIELPVKEEEPPEKTVDVAAEKKVVKITSEIPQTERMQKRAE RFNVPVSLESK KAARAAR 
FGISSVPTK GLSSDNKPMVNLDKLKERAQ RFGLNVSSISRK SEDDEKLKKRKER FGIVTSSA G 
TGTTEDTEAK KRKRAERFGIA 

Underlined sequences are amino acid sequences obtained by MS/MS analysis. 
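
The relationship between the nucleotide sequence of Figure 1 and the amino acid sequence of Figure 2 can be illustrated with a short script. This is only a minimal sketch: it assumes the standard genetic code and assumes that translation begins at the first ATG of the sequence. The function name `translate_first_orf` is hypothetical and is not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_first_orf(dna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon.

    Assumption: the first ATG is the start codon. Characters other than
    A, C, G, T (spaces, line breaks) are ignored.
    """
    seq = "".join(ch for ch in dna.upper() if ch in "ACGT")
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line of the Figure 1 sequence (58 nucleotides).
first_line = "GGGAGTGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAA"
print(translate_first_orf(first_line))
```

Applied to the first line of Figure 1, the sketch yields the residues with which the Figure 2 sequence begins; the trailing incomplete codon is ignored.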



Figure 3 



The sequence of the novel gene amplified by long-distance PCR and used to 
construct the expression vector (873 bp). 



TGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAAAGCTT 

GCCGAACTAAAGCAAGAATGTCTTGCTCGTGGTTTGGAGACCAAGGGAATAAAGCAAGA 

TCTTATCCACAGACTCCAGGCATATCTTGAAGAACATGCTGAAGAGGAGGCAAATGAAG 

AAGATGTACTGGGAGATGAAACAGAGGAAGAAGAAACAAAGCCCATTGAGCTCCCTGTC 

AAAGAGGAAGAACCCCCTGAAAAAACTGTTGATGTGGCAGCAGAGAAGAAAGTGGTGAA 

AATTACATCTGAAATACCACAGACTGAGAGAATGCAGAAGAGGGCTGAACGATTCAATGT 

ACCTGTGAGCTTGGAGAGTAAGAAAGCTGCTCGGGCAGCTAGGTTTGGGATTTCTTCAG 

TTCCAACAAAAGGTCTGTCATCTGATAACAAACCTATGGTTAACTTGGATAAGCTGAAGG 

AAAGAGCTCAAAGATTTGGTTTGAATGTCTCTTCAATCTCCAGAAAGTCTGAAGATGATG 

AGAAACTGAAAAAGAGGAAGGAGCGATTTGGGATTGTCACAAGTTCAGCTGGAACTGGA 

ACCACAGAGGATACAGAGGCAAAGAAGAGGAAAAGAGCAGAGCGCTTTGGGATTGCCT 

GATGAAAAGTTCCTGATACTTTCTGTTCTCCAGTGTTTTCCATTTCTCTCCTTCTTCTTGG 

TCACATATATGCCTAAATGCACAGTCATGTGCCTACGTCCTGCCTCGCAATGAGGGAGG 

ATGTACCCCAGGTACATCCATGAACTGCGGCAGCAGTTTGACTTATTGCTGTTTCAGCTT 

TAAGGTTGTTGTGTTTTTGTTTTTGATTATGTTGCTTGTTAAT 



Figure 4 




[Image not reproduced. Labels: M; 22 cycles of PCR; 26 cycles of PCR] 



Figure 5 
P-151 5'-Untranslated Region 

1 75 
CAGGGGCAGCAGTGATTATCTGAACTCGGATCTTTAAAATTGTGGTAGCTCTAAAGCTGATGATGTCTGGTTAGG 

76 150 
AAGTGGCTCTTGCCCGCCCCAGCCCCACCGCCAGTTCCTTAAGCCCGCCCCATGCCCCTCCCAGCTTCCTCCTCA 

151 225 
TGTTCATCGGTTTTTTCAGGGCTCCCTTCAACGCTCCCCTCTCAGTATTTAGGTCACCACTCCCTCGGCGCCCCT 

226 300 
TTCGCCTCCCACCATTTTTCCTCAGCAACCCTTACAGTCTTTGCAGCTCCTACCTGCCAGCTCAGATCCCCGTCC 

301 375 
GGCT ATGGGCGCGGCGCCGGCTACCACACCTGAAGTCTCCAGGAAGTAA CGCCTCTCCTTCTGCCCCTTTCCTGT 

376 450 
TGGAGGAACAGAATCAGCGCTGCCACCACCCATTGGTTGGTGGTCTGT AATGCAGAAGCACAGTTGGTTGCCATT 

451 525 
TCTGTCGTTCGCAAGATACAGTGCCCGCCCCTCTCCCAGTTCCACCTTTTGA AAGAGGTGGGGCAAGCTGCCTAG 

526 600 
AGAAGTGAGAGCGACGTCAGCTATTGACCA ATGGGAAGAGCTGATGGTATGGCGTGGGAGCAAGAGTGA CAACGA 

601 675 
TTGGTCAGCCTTGCATCTCTACGCCTAAGGCGGGAACTCCTGGAGGCGGAGGCCGCGGGTGGGGGGAGTGGAGTG 

676 

AGGGGTAACAAGATG P151 coding region 



(Total length: 690 bp) 



Sequence with asterisk: the 274 bp fragment 

Underlined sequences are the minicistrons, or uORFs, upstream of the start of the P151 coding 
region, with the start and stop codons in bold. 
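
The uORFs described in this legend can be located programmatically. The following is a minimal sketch, assuming that a uORF is any ATG followed by an in-frame stop codon (TAA, TAG or TGA) within the leader sequence; the function name `find_uorfs` is hypothetical, and the criteria actually used to mark the figure are not stated in the source.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader: str) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG with an in-frame stop.

    Positions are 1-based and inclusive. Characters other than A, C, G, T
    are ignored. An ATG lying inside another reading frame is also
    reported, so overlapping or nested hits are possible.
    """
    seq = "".join(ch for ch in leader.upper() if ch in "ACGT")
    hits = []
    pos = seq.find("ATG")
    while pos != -1:
        # Walk codon by codon from the ATG until a stop codon is reached.
        for i in range(pos + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                hits.append((pos + 1, i + 3, seq[pos:i + 3]))
                break
        pos = seq.find("ATG", pos + 1)
    return hits


# Start of the row beginning at position 301 in Figure 5.
fragment = "GGCT ATGGGCGCGGCGCCGGCTACCACACCTGAAGTCTCCAGGAAGTAA CGCC"
for start, end, orf in find_uorfs(fragment):
    print(start, end, orf)
```

Positions reported by the sketch are relative to the string passed in; to obtain coordinates in the 690 bp region, the full sequence of Figure 5 would be supplied instead of a fragment.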



